Conceptual Representation Using WordNet for Text Categorization
Similar resources
Using WordNet for Text Categorization
This paper explores a method that uses WordNet concepts to categorize text documents. The bag-of-words representation commonly used for text is unsatisfactory because it ignores possible relations between terms. The proposed method extracts generic concepts from WordNet for all the terms in the text, then combines them with the terms in different ways to form a new representative vector. The effe...
Text Representation for Automatic Text Categorization
Automatic Text Categorization (ATC), the automatic assignment of text documents to predefined classes, is a language engineering task highly relevant to a number of applications, including automatic content and knowledge management in corporations and on the Internet, information access and filtering, etc. With the first works dating back to the 60’s [14], and increased work in the last decade (see the surv...
Using WordNet to Complement Training Information in Text Categorization
Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed through the use of a set of manually classified documents, a training collection. We suggest the utilization of additional resources like lexical databases to increase the amount of information that TC systems make use of, and thus, to improve their performance. Our a...
Text Categorization and Information Retrieval Using WordNet Senses
In this paper we study the influence of semantics in the Text Categorization (TC) and Information Retrieval (IR) tasks. The K Nearest Neighbours (K-NN) method was used to perform the text categorization. The experimental results were obtained by taking into account, for a relevant term of a document, its corresponding WordNet synset. For the IR task, three techniques were investigated: the direct us...
Text Representation with WordNet Synsets using Soft Sense Disambiguation
Text information processing depends critically on the proper representation of texts. A common and naive way of representing a text is as a bag of its component words. This representation suffers primarily from two drawbacks, viz., polysemy and synonymy which arise because of the ambiguity of the words and the lack of information about the relations between the words. This paper presents a mode...
Journal
Journal title: International Journal of Computer and Communication Engineering
Year: 2014
ISSN: 2010-3743
DOI: 10.7763/ijcce.2014.v3.286